nlp_architect.models.chunker.SequenceChunker

class nlp_architect.models.chunker.SequenceChunker(use_cudnn=False)[source]

A sequence chunker model written in TensorFlow (and Keras), based on the SequenceTagger model. It uses only the chunking output of the underlying tagger.

__init__(use_cudnn=False)

Initialize self. See help(type(self)) for accurate signature.

Methods

__init__([use_cudnn])

Initialize self.

build(vocabulary_size, num_pos_labels, …)

Build a chunker/POS model

fit(x, y[, batch_size, epochs, …])

Fit the built model on the provided x and y

load(filepath)

Load model from disk

load_embedding_weights(weights)

Load word embedding weights into the model embedding layer

predict(x[, batch_size])

Predict labels given x.

save(filepath)

Save the model to disk

build(vocabulary_size, num_pos_labels, num_chunk_labels, char_vocab_size=None, max_word_len=25, feature_size=100, dropout=0.5, classifier='softmax', optimizer=None)

Build a chunker/POS model

Parameters
  • vocabulary_size (int) – the size of the input vocabulary

  • num_pos_labels (int) – the number of POS labels

  • num_chunk_labels (int) – the number of chunk labels

  • char_vocab_size (int, optional) – character vocabulary size

  • max_word_len (int, optional) – max characters in a word

  • feature_size (int, optional) – feature size - determines the embedding/LSTM layer hidden state size

  • dropout (float, optional) – dropout rate

  • classifier (str, optional) – classifier layer: ‘softmax’ for a softmax classifier or ‘crf’ for a conditional random field classifier. Default is ‘softmax’.

  • optimizer (tensorflow.python.training.optimizer.Optimizer, optional) – optimizer to use; if None, the default SGD is used (paper setup)
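The parameters above can be collected and checked before build is called. The following is a minimal sketch, not part of NLP Architect: make_build_kwargs and build_chunker are hypothetical helpers, and only the parameter names and defaults are taken from the signature above. build_chunker needs nlp_architect to be installed.

```python
def make_build_kwargs(vocabulary_size, num_pos_labels, num_chunk_labels,
                      char_vocab_size=None, max_word_len=25, feature_size=100,
                      dropout=0.5, classifier="softmax", optimizer=None):
    """Hypothetical helper: validate and collect keyword arguments for build()."""
    sizes = {
        "vocabulary_size": vocabulary_size,
        "num_pos_labels": num_pos_labels,
        "num_chunk_labels": num_chunk_labels,
    }
    for name, value in sizes.items():
        if not isinstance(value, int) or value <= 0:
            raise ValueError(f"{name} must be a positive int, got {value!r}")
    # The documented choices for the classifier layer.
    if classifier not in ("softmax", "crf"):
        raise ValueError("classifier must be 'softmax' or 'crf'")
    kwargs = dict(sizes)
    kwargs.update(
        char_vocab_size=char_vocab_size,
        max_word_len=max_word_len,
        feature_size=feature_size,
        dropout=dropout,
        classifier=classifier,
        optimizer=optimizer,
    )
    return kwargs


def build_chunker(build_kwargs, use_cudnn=False):
    """Hypothetical helper: create a SequenceChunker and build it."""
    # Imported lazily so make_build_kwargs works without the library installed.
    from nlp_architect.models.chunker import SequenceChunker
    model = SequenceChunker(use_cudnn=use_cudnn)
    model.build(**build_kwargs)
    return model
```

For example, make_build_kwargs(5000, 45, 23, classifier="crf") returns a dictionary that can be passed to build_chunker; the sizes here are made-up values for illustration.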

fit(x, y, batch_size=1, epochs=1, validation_data=None, callbacks=None)

Fit the built model on the provided x and y

Parameters
  • x – x samples

  • y – y samples

  • batch_size (int, optional) – number of samples per batch

  • epochs (int, optional) – number of epochs to run before ending training process

  • validation_data (optional) – x and y samples used for validation at the end of each epoch

  • callbacks (optional) – additional callbacks to run with fitting
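One way to obtain validation_data is to hold out the tail of the training set. The sketch below is an illustration only: holdout_split is a hypothetical helper, and it assumes that x and y are sliceable sequences of equal length and that validation_data is passed as an (x, y) pair, which the description above suggests but does not spell out.

```python
def holdout_split(x, y, validation_fraction=0.1):
    """Hypothetical helper: split x and y into training and validation parts."""
    if len(x) != len(y):
        raise ValueError("x and y must contain the same number of samples")
    if len(x) < 2:
        raise ValueError("need at least two samples to split")
    if not 0.0 < validation_fraction < 1.0:
        raise ValueError("validation_fraction must be between 0 and 1")
    # Keep at least one sample on each side of the split.
    n_val = min(len(x) - 1, max(1, int(len(x) * validation_fraction)))
    split = len(x) - n_val
    return (x[:split], y[:split]), (x[split:], y[split:])


# Assumed usage with a model that has already been built:
# (x_train, y_train), (x_val, y_val) = holdout_split(x, y, 0.1)
# model.fit(x_train, y_train, batch_size=32, epochs=5,
#           validation_data=(x_val, y_val))
```

The split is not shuffled, so the samples should be shuffled beforehand if their order carries meaning.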

load(filepath)

Load model from disk

Parameters

filepath (str) – file name of model

load_embedding_weights(weights)

Load word embedding weights into the model embedding layer

Parameters

weights (numpy.ndarray) – 2D matrix of word weights
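A sketch of how such a matrix might be assembled from pretrained vectors follows. It rests on two assumptions that the description above does not state: row i holds the vector of the word with index i, and the number of columns equals the feature_size passed to build. build_embedding_rows is a hypothetical helper, not a library function.

```python
import random


def build_embedding_rows(word_to_index, pretrained, feature_size, seed=0):
    """Hypothetical helper: return one embedding row per vocabulary index."""
    rng = random.Random(seed)
    size = max(word_to_index.values()) + 1
    # Words without a pretrained vector keep a small random initialization.
    rows = [[rng.uniform(-0.05, 0.05) for _ in range(feature_size)]
            for _ in range(size)]
    for word, index in word_to_index.items():
        vector = pretrained.get(word)
        if vector is None:
            continue
        if len(vector) != feature_size:
            raise ValueError(f"vector for {word!r} has wrong length")
        rows[index] = list(vector)
    return rows


# Assumed usage, after build() has been called:
# import numpy as np
# weights = np.asarray(build_embedding_rows(vocab, vectors, 100),
#                      dtype="float32")
# model.load_embedding_weights(weights)
```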

predict(x, batch_size=1)[source]

Predict labels given x.

Parameters
  • x – samples for inference

  • batch_size (int, optional) – forward pass batch size

Returns

tuple of numpy arrays of chunk labels
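The description above does not say whether the returned arrays hold label ids or per-label scores. Assuming the latter (one score vector per token, as a softmax classifier would produce), the scores can be turned into label strings with an argmax. decode_labels and the label map below are hypothetical and only illustrate that step; padding positions are not handled.

```python
def decode_labels(scores, index_to_label):
    """Hypothetical helper: pick the highest-scoring label for each token.

    scores is indexed as [sentence][token][label]; nested lists and
    numpy arrays both work.
    """
    decoded = []
    for sentence in scores:
        labels = []
        for token_scores in sentence:
            # Index of the largest score for this token.
            best = max(range(len(token_scores)),
                       key=lambda i: token_scores[i])
            labels.append(index_to_label[best])
        decoded.append(labels)
    return decoded


# Assumed usage:
# chunk_scores = model.predict(x, batch_size=1)
# labels = decode_labels(chunk_scores, index_to_label)
```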

save(filepath)

Save the model to disk

Parameters

filepath (str) – file name to save model